Abstract: In this paper, we quantify how much codes can 
reduce the data retrieval latency in storage systems. By combining 
a simple linear code with a novel request scheduling algorithm, 
which we call Blocking-one Scheduling (BoS), we show analyti- 
cally that it is possible to reduce data retrieval delay by up to 17% 
over currently popular replication-based strategies. Although in 
this work we focus on a simplified setting where the storage 
system stores a single content, the methodology developed can 
be applied to more general settings with multiple contents. The 
results also offer insightful guidance to the design of storage 
systems in data centers and content distribution networks. 



I. Introduction 

In today's data centers, one of the most demanding tasks 
(in terms of latency) is "disk-read," e.g., fetching the data 
for performing analytics such as MapReduce, or to serve the 
data to the end consumer. In many cases, this task is greatly 
complicated by the highly non-uniform data popularity, where 
the most popular data objects can be accessed ten times more 
frequently than the bottom third [1]. This skewed demand 
leads to high contention for read tasks of the most popular 
data. To meet the demand and reduce data retrieval latency, 
current systems often introduce data redundancy by replicating 
many copies of each content, e.g., Hadoop replicates each 
content three or more times, to make the popular data more 
available and relieve the hot spots of read contentions, thereby 
reducing the average request latency. 

This motivates us to investigate the fundamental role of 
redundancy in improving the system latency. In particular, we 
compare two systems, one using codes and the other using 
simple replication. Heuristically, codes offer more flexibility 
when retrieving the data from the servers, thus may improve 
content retrieval latency. In this paper, we use a non-asymptotic 
analytical approach to quantify this intuition. 

To make the problem more concrete, consider an abstracted 
example of retrieving a file in a data center shown in Fig. 1. 
The storage system consists of 4 storage units, each capable 
of storing 1 packet of the desired file that consists of 2 packets 
A and B. We consider two possible storage strategies: 1) each 
packet is replicated two times; 2) the file is encoded using a (4, 2) 
Maximum Distance Separable (MDS) code. It is easy to see the 
redundancy factor is 2 in both cases. There is a common 
dispatcher that queues and schedules the incoming requests. 
We assume the request process is Poisson and the service time 
of the storage unit is exponentially distributed. 

Quantifying the exact request delay in a coded system is 
a challenging task. The main difficulty is that the scheduling 
algorithm needs to remember which storage units (SUs) served 
earlier requests to ensure that the request of a particular content 
is always served by distinct SUs. This makes the analysis 
extremely difficult. Moreover, since data centers cannot afford 
a redundancy factor of more than a few tens at most, the 
asymptotic-analysis-based approaches advocated in [3], [9] are 
not relevant. 

To tackle this challenge, we develop a novel scheduling 
algorithm called Blocking-one Scheduling (BoS). The idea 
is to block some subsequent content retrieval requests until 
the head-of-line request is processed. This helps remove the 
dependency in the scheduling actions, and allows a clean 
analysis of the delay performance of system using codes. 
Fortunately, as we will see later, the fraction of throughput 
loss due to BoS is $O(1/r^2)$, where $r$ is the redundancy factor. 
Indeed, even for $r = 2$ in our simple example, BoS 
achieves 96% of the maximum system throughput. 

Under the above settings, we analytically show that strategy 
2 that uses codes reduces the average request delay, for the 
considered example, by 7% or more compared to strategy 
1 that uses replication. The intuition behind this is that 
in the replication system, the requests for packet A or B 
can be satisfied by only two specific servers, while in the 
coded system any two servers are sufficient to serve the file. 
Therefore, codes offer more flexibility in data retrieval due to 
multiplexing gain of available servers. 
Fig. 1. An example information storage system storing a file that consists 
of two packets A and B. Requests for contents are assigned by a central 
dispatcher to available storage units. In order to reduce the content retrieval 
latency, we can either use replication, i.e., store the packets A, B twice, or 
coding, i.e., store coded packets of A and B. 

Related Work: The authors in [5] and [11] study the 
throughput-delay gains of network coding in a single-hop 
wireless downlink with unreliable channels. The authors of 
[8] consider the network coding gains in throughput when 
packets have hard delay deadlines. The work of [4] studies 
the gains in delay-throughput when using network coding over 
a linear network with unreliable links. In [10], the authors 
use network coding to prevent underflow in a multimedia 
streaming application. While most of the earlier works focus 
on the use of codes to achieve the best-effort throughput and/or 
delay over unreliable wireless channels, non-asymptotic 
and theoretical analysis of queueing delay in coded systems 
remains largely unaddressed, to the best of our knowledge. 

The paper is organized as follows. In Section II, we state 
our system model. In Section III, we present the schemes that 
will be used in the uncoded and coded systems, and present 
the Blocking-one Scheduling (BoS) algorithm. We analyze 
the performance of BoS in Section IV. We then conclude the 
paper in Section V. 

II. System Model 

We consider an information storage system that consists 
of $n$ homogeneous storage servers, called storage units (SUs). Each 
SU has a storage capacity of one, and is capable of serving 
each incoming content request in a time that is exponentially 
distributed with rate $\mu = 1$ (so the mean service time is 
$1/\mu = 1$). The system stores and serves a 
set of contents, denoted by $\mathcal{C} = \{1, 2, \ldots, C\}$. Each content 
is striped into $k$ packets of size one. A total of $k$ SUs are 
needed to store a single content. In practice, for reliability 
and availability purposes, a content is stored redundantly on 
multiple servers. 

Assume the content retrieval requests for each content c 
form a Poisson arrival process with rate $\lambda_c$. Since a content 
is striped into k packets that are stored on distinct SUs, every 
request needs to be served at k servers with distinct packets 
in order to fully retrieve the desired content. We model this 
behavior by duplicating each arrived request into k packet- 
requests and by making sure that no two packet-requests are 
processed by servers with identical packets. 

There is a central dispatcher that delegates the incoming 
requests to the SUs. Upon arrival, the requests are first queued 
at the central dispatcher, and then sent to the servers once they 
become available.^1 In such a scenario, we are interested in 
quantifying the reduction in average content delay of a 
coding-based system relative to a replication-based system. 

To illustrate the problem further, consider a simple example 
depicted in Fig. 2. The system contains 4 storage units 
and 1 content. The content is striped into 2 packets, A 
and B, and is stored with a redundancy factor of 2. Each 
arriving content request brings into the system two 
packet-requests, e.g., 3A, 3B. In Fig. 2(a) we show a system that uses 
replication to introduce redundancy, where packets A and B 
are each replicated twice. In contrast, Fig. 2(b) shows a system 
that uses an MDS code of rate 0.5 to introduce the desired 
redundancy. The question we aim to answer in this paper is: 
by how much can a coding-based system reduce the average 
delay relative to a replication-based one? 

In general, quantifying the delay performance of a system 
with multiple contents, multiple servers and multiple coded 
packets that are replicated multiple times can be quite challenging. 
The main difficulty lies in the fact that the SUs 
now store coded data, thus the scheduling algorithm has to 
ensure that the requests for the same content are not served 
by the same SU. This means that the scheduling algorithm may 
need to remember at which SUs all the earlier requests are 
served. With such a long memory, even defining an appropriate 
system state is very hard, let alone analyzing it. To make the 
analysis tractable, we focus on a small-sized problem with 
a single content, 2 packets, and $n$ servers. We provide a 
theoretical upper bound on the average content retrieval delay 
of the coded system, which we show still beats 
that of a replication-based system with an identical amount of SU 
resources. We believe that the methodology and algorithms 
developed in this paper will provide useful insights into more 
general cases, which are part of our ongoing work. 

^1 Note that such shared queue models have also been widely used to study 
data center problems, e.g., [7], [6]. 

Fig. 2. An example storage system with 4 storage units (SU). The letter 
inside the SU box corresponds to the packet it stores. Each request arrival 
brings two packet requests into the system, which must be served by 2 SUs 
with distinct packets. A single queue is maintained for all the requests. (a) 
shows the uncoded system, where the packet A requests are served by SU 1 
and 2, and the packet B requests are served by SU 3 and 4. (b) shows the 
coded system, where any two distinct SUs can serve both packet requests 
A and B. 

III. Storage and scheduling schemes 

In this section, we present the storage and scheduling 
schemes for both uncoded and coded systems. We assume 
that both systems have identical resources: each system hosts 
a single content that is divided into $k = 2$ packets, and has 
$n = 2r$, $r \ge 2$, SUs, each capable of storing 1 packet. Each 
server can serve requests with rate $\mu = 1$. Here $r$ is the 
redundancy factor in the system, which provides both content 
availability and reliability. 

A. Uncoded system 

In this case, due to inherent symmetry of arrivals of sub- 
requests for packets A, B, r SUs store packet A and the 
remaining r store packet B. Then, whenever there is an idle 
SU that stores packet A, the packet A request from the head- 
of-line request is assigned to this server. The same happens 
for packet B requests. 

Under this setting, the uncoded system can be modeled as 
two M/M/r queueing systems (see Fig. 2(a)).^2 Now denote by 
$\pi_i$ the steady-state probability that there are $i$ packet requests 
in an M/M/r system, and denote $\rho = \lambda/(r\mu)$. We note that 
$\{\pi_0, \pi_1, \ldots\}$ can be computed as follows [2]: 

$$\pi_0 = \left[\sum_{m=0}^{r-1} \frac{(r\rho)^m}{m!} + \frac{(r\rho)^r}{r!}\cdot\frac{1}{1-\rho}\right]^{-1}, \tag{1}$$

$$\pi_m = \begin{cases} \dfrac{(r\rho)^m}{m!}\,\pi_0, & \forall\, m \in \{1, \ldots, r\}, \\[2mm] \dfrac{r^r \rho^m}{r!}\,\pi_0, & \forall\, m > r. \end{cases} \tag{2}$$

^2 Note that the two separate systems are indeed not independent, since the 
arrivals to the systems happen at the exact same time. However, this does not 
affect the analysis for the average request delay. 

The average packet request delay $d^{\text{uncoded}}_{\text{packet}}$ can then be computed 
by Little's law: 

$$d^{\text{uncoded}}_{\text{packet}} = \frac{\sum_{i=0}^{\infty} i\,\pi_i}{\lambda}. \tag{3}$$

Although the uncoded system admits an easy analysis of the 
average packet request delay, we see that finding the average 
request delay can be quite challenging. However, as we will 
see in the coded system, the average request delay can be 
easily computed under our algorithm. 
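
The M/M/r computation in (1)-(3) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not code from the paper: the function name `mmr_packet_delay` is hypothetical, and the geometric tail of the distribution beyond state r is summed in closed form instead of being truncated.

```python
from math import factorial


def mmr_packet_delay(lam, r, mu=1.0):
    """Average packet request delay of one M/M/r queue, following (1)-(3).

    lam: request arrival rate (each request adds one packet request to
         each of the two M/M/r queues of the uncoded system).
    r:   number of SUs serving this packet type.
    mu:  service rate of each SU.
    """
    rho = lam / (r * mu)
    if not 0 < rho < 1:
        raise ValueError("need 0 < lam < r * mu for stability")
    a = r * rho  # offered load lam / mu
    # Equation (1): probability of an empty system.
    pi0 = 1.0 / (sum(a**m / factorial(m) for m in range(r))
                 + a**r / (factorial(r) * (1.0 - rho)))
    # Equation (2), states 1..r: pi_m = a^m / m! * pi0.
    mean = sum(m * a**m / factorial(m) for m in range(1, r + 1)) * pi0
    # States m > r: pi_m = pi_r * rho^(m - r).
    # Closed form of sum_{j >= 1} (r + j) * rho^j.
    pi_r = a**r / factorial(r) * pi0
    mean += pi_r * (r * rho / (1.0 - rho) + rho / (1.0 - rho) ** 2)
    # Equation (3): Little's law with packet arrival rate lam.
    return mean / lam
```

For r = 1 the function reduces to the M/M/1 delay 1/(mu - lam), which gives a quick sanity check of the formulas.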

B. Coded system: Blocking-One Scheduling (BoS) 

We now specify our coding and scheduling schemes for the 
coded system. We have $n = 2r$ SUs and $k = 2$ packets A, B. 
We adopt a simple linear $(n, 2)$ MDS code (any family of 
MDS codes will do) to generate $n$ encoded packets, one of which is 
stored at each SU. Due to the MDS nature of the code, 
any request that is served at any two distinct SUs will be able 
to retrieve the full content, e.g., see Fig. 2(b). Under such a 
coding scheme, in order to minimize the average packet request 
delay, a simple greedy scheduling strategy would be: queue all 
the requests in a single queue, and whenever there is an idle SU 
that can serve any request in the queue, assign the request to 
that SU. Although the suggested greedy scheme seems simple, 
it has inherent memory/dependency in scheduling the requests 
due to the use of codes. For example, if the first packet request 
of the $i$-th request, denoted by $R_{i1}$, is served at SU $j$, then 
the greedy scheduler has to remember not to assign the other 
packet request $R_{i2}$ to SU $j$ even if it becomes idle. This 
dependency builds infinite memory into the system, which 
makes it very challenging to exactly analyze the average delay 
performance of requests in the coded system. 

In order to resolve this difficulty, we propose a novel 
scheduler called Blocking-one Scheduling (BoS), for the 
coded system. The main idea of BoS is to break the memory 
in the scheduler by blocking the requests beyond the head- 
of-line request until both the packet requests of the head-of- 
line request are served. The BoS scheduler also corresponds 
to first-come-first-serve (FCFS). As we will see, the BoS 
algorithm not only greatly simplifies the analysis of the packet 
request delay, but also allows us to directly calculate the 
average request delay. 

Algorithm 1 Blocking-one Scheduling (BoS) 

1: At any time $t$, denote the set of idle SUs as $S_{idle}$, do: 

1) If $S_{idle} \neq \emptyset$, assign the packet requests from the 
head-of-line request to an idle SU in $S_{idle}$ as follows: 

• If packet request 1 has not yet been assigned, 
assign it to the idle SU. 

• Else assign packet request 2 if the corresponding 
packet request 1 was not served by the idle server. 

2) If a packet request is assigned to the SU, change the 
state of the SU from idle to busy, remove it from 
$S_{idle}$, and remove the packet request from the queue. 

3) Repeat step 1) until no further assignment can be 
made. 
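
The steps of Algorithm 1 can be sketched as a small dispatcher. This is an illustrative sketch only: the class name `BoSDispatcher`, its methods, and the tie-breaking rule (pick the lowest-numbered eligible SU) are assumptions introduced here, and service times are left to the caller, who signals a completion by calling `release`.

```python
from collections import deque


class BoSDispatcher:
    """Hypothetical sketch of Blocking-one Scheduling for k = 2 packets."""

    def __init__(self, n_units):
        self.idle = set(range(n_units))  # the set of idle SUs
        self.queue = deque()             # FCFS queue of request ids
        # SU that received packet request 1 of the head-of-line request,
        # or None if packet request 1 has not been assigned yet.
        self.first_su = None

    def arrive(self, req_id):
        self.queue.append(req_id)
        return self.dispatch()

    def release(self, su):
        """An SU finished its current packet request and becomes idle."""
        self.idle.add(su)
        return self.dispatch()

    def dispatch(self):
        """Repeat step 1) until no further assignment can be made."""
        assigned = []
        while self.queue and self.idle:
            req = self.queue[0]
            if self.first_su is None:
                su = min(self.idle)  # assumed tie-breaking rule
                self.first_su = su
                assigned.append((req, 1, su))
            else:
                # Packet request 2 must avoid the SU used by packet request 1.
                candidates = self.idle - {self.first_su}
                if not candidates:
                    break  # head-of-line request blocks the whole queue
                su = min(candidates)
                assigned.append((req, 2, su))
                self.queue.popleft()  # request fully dispatched
                self.first_su = None
            self.idle.remove(su)
        return assigned
```

The only memory the dispatcher keeps is `first_su` for the head-of-line request, which is the point of BoS: no history of earlier requests is needed.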



Note that under BoS, it can happen that there exists an 
idle SU but no assignment is made even when the queue is 
non-empty. For example, this happens when the free SU has served 
packet request 1 of the head-of-line request and no other 
SU is idle. In this case, the requests beyond the head-of-line 
request are "blocked." Due to this blocking effect, there will 
be a throughput loss due to the lost scheduling opportunity. 
Fortunately, as we will see later, the fraction of throughput loss 
is $O(1/r^2)$, which goes to zero as $r$ increases. Indeed, even 
for $r = 2$, BoS achieves 96% of the maximum system 
throughput. 

IV. Analyzing the BoS algorithm 

We now analyze the BoS algorithm by finding the steady- 
state distribution of the coded system under BoS. The ap- 
proach works as follows. We first derive the continuous-time 
Markov chain that captures the system evolution. Then, we 
analyze the Markov chain by carefully choosing a set of global 
balance equations that allow us to compute the steady-state 
distribution. 

A. The system evolution 

In this section, we present the Markov chain that models the 
system evolution. Towards that end, we first take a closer look 
at the state evolution of the example system in Fig. 2(b) with 4 
SUs. Suppose the system is in a state as shown in Fig. 4(a). In 
this case, the 4 SUs are serving packet requests $R_{11}$, $R_{12}$, 
one packet request of request 2, and $R_{31}$. There are three more 
packet requests, $R_{32}$, $R_{41}$ and $R_{42}$, in the queue. Thus, the 
total number of packet requests 
in the system is 7. We denote the state of the system by the 
total number of packet requests in the system, i.e., the state is 7. 

Now, if SU 2 completes its service, the packet request $R_{32}$ 
can be assigned to SU 2. This results in a state $(6, p)$ as 
shown in Fig. 4(b), which we refer to as a "perfect state." 
This state is called "perfect" because once in this state, past 
evolution is irrelevant in determining the future scheduling 
events and the total service rate is maximum, i.e., $4\mu$. Note that 
the system will also enter $(6, p)$ if either SU 1 or 3 completes 
its service. Hence, the transition rate from state 7 into $(6, p)$ 
is $3\mu$. However, if starting from state 7 but SU 4 completes 
the service, then we cannot assign request $R_{32}$ to SU 4, since 
$R_{31}$ was served at SU 4. In this case BoS will block the 
requests $R_{41}$, $R_{42}$ until $R_{32}$ gets assigned to some other SU. 
This results in a not so perfect state, which we denote by 
$(6, g)$, i.e., a good state with 6 packet requests in the system 
(see Fig. 4(c)). In this good state, the aggregate service rate 
is $3\mu$. Note that, different from $(6, p)$, the system enters $(6, g)$ 
from state 7 only when SU 4 finishes its service. Hence, this 
transition happens only with rate $\mu$. 

Fig. 3. The continuous-time Markov chain of the system with $2r$ storage units. The request arrival rate is $\lambda$ and the service rate of each storage unit is $\mu$. 
Each state of the chain represents the number of packet requests in the system. The letters "p" and "g" represent the perfect and good states. The cuts are 
used to specify the global balance equations we will use in our analysis. 

Fig. 4. System evolution of an example with 4 storage units (SU). (a) shows 
a state with 7 packet requests in the system with an aggregate service rate 
of $4\mu$. (b) shows the resulting state after the departure of $R_{12}$ from SU 2, 
denoted by $(6, p)$, where the system "renews" itself and every queued request 
can go to any SU. (c) shows the resulting state after the departure of $R_{31}$ from 
SU 4, denoted by $(6, g)$. In this case, one SU is "wasted" due to blocking, 
and the system serves requests with rate $3\mu$. 

Although BoS performs sub-optimally as compared to a 
system with codes that maintains an infinite memory and 
assigns $R_{41}$ to SU 4 when it becomes idle, it permits analytical 
treatment of the average request delay. It can be verified that this 
blocking situation appears only when the system is in a state 
that has $2r$ or more packet requests in the system, and the 
number of packet requests is even. Thus, to characterize the 
system state, we introduce the suffixes "p" and "g," which stand 
for "perfect" and "good," for all the states with $2r + 2m$, $m \ge 0$, 
packet requests in the system. When the number of packet 
requests in the system is less than $2r$, we simply use the total 
number of packet requests in the system to denote the state of 
the system. 

Now that the system states have been defined, the Markov 
chain in Fig. 3 explains the evolution of the system under 
BoS. The chain can be understood as follows: 

• With rate $\lambda$, there is a request arrival event, and two 
new packet requests are added to the system. If the system 
was in a good state before the arrival, it remains in a good 
state after the arrival, and likewise for a perfect state. 

• For any state below $2r$, the transition rate for service completion 
is equal to $\mu$ times the state. 

• For any odd state above $2r$, there are two outgoing transitions 
for service completion: 

1) With rate $(2r - 1)\mu$, there is a transition to an even 
perfect state. 

2) With rate $\mu$, there is a transition to an even good state. 

• From an even-perfect state $\ge 2r$, there is a service 
completion transition to an odd state with rate $2r\mu$. 

• From an even-good state $\ge 2r$, there is a service 
completion transition to an odd state with rate $(2r - 1)\mu$. 
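
The transition rules listed above can be written down as a function that returns the outgoing rates of a state. This is a sketch under assumptions: the function name `bos_transitions` and the state encoding `(n, tag)` (with `tag` equal to `'p'` or `'g'` for even states at or above 2r, and `None` otherwise) are introduced here for illustration. The rule that an arrival in state 2r - 2 leads to the perfect state (2r, p) is read off the balance equation for the "p" states in Appendix A.

```python
def bos_transitions(state, r, lam, mu=1.0):
    """Outgoing transition rates of the BoS Markov chain.

    state: (n, tag), n = number of packet requests in the system;
           tag is 'p' or 'g' for even n >= 2r, and None otherwise.
    Returns a dict mapping next_state -> rate.
    """
    n, tag = state
    out = {}

    # Arrival: two packet requests are added; the p/g label is preserved.
    m = n + 2
    if m >= 2 * r and m % 2 == 0:
        new_tag = tag if tag is not None else 'p'  # 2r - 2 enters (2r, p)
    else:
        new_tag = None
    out[(m, new_tag)] = lam

    # Service completion.
    if 0 < n < 2 * r:
        out[(n - 1, None)] = n * mu                  # n busy SUs
    elif n >= 2 * r and n % 2 == 1:
        out[(n - 1, 'p')] = (2 * r - 1) * mu         # no blocking results
        out[(n - 1, 'g')] = mu                       # one SU gets blocked
    elif n >= 2 * r and tag == 'p':
        out[(n - 1, None)] = 2 * r * mu              # all 2r SUs busy
    elif n >= 2 * r and tag == 'g':
        out[(n - 1, None)] = (2 * r - 1) * mu        # one SU is wasted
    return out
```

For the 4-SU example, `bos_transitions((7, None), 2, lam)` gives rate 3μ into (6, p) and rate μ into (6, g), matching the discussion of Fig. 4.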



B. Performance of BoS: Average packet request delay 

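Here we present the performance results of BoS. To state 
the theorem, we first define some notation: 

$$\eta = \frac{\lambda}{2r\mu} + \frac{\lambda(2r-1)\mu}{(2r\mu)^2} + \frac{\lambda\mu}{(2r-1)\mu \cdot 2r\mu},$$

$$\gamma_p = \frac{2r\mu(\lambda + (2r-1)\mu)}{\mu} + (\lambda + 2r\mu), \qquad \gamma_g = \frac{-(2r-1)\mu(\lambda + 2r\mu)}{2r\mu} - (2r-1)(\lambda + (2r-1)\mu),$$

$$\beta_p = \frac{\lambda(\lambda + (2r-1)\mu)}{\mu}, \qquad \beta_g = \frac{-\lambda(\lambda + 2r\mu)}{2r\mu}. \tag{4}$$

Also, for $l \in \{0, \ldots, 2r-1\}$, define $a_l$ as follows: 

$$a_0 = 1, \quad a_1 = \frac{\lambda}{\mu}\,a_0, \quad a_l = \frac{\lambda}{l\mu}\,(a_{l-1} + a_{l-2}). \tag{5}$$

Now denote by $\pi_0, \ldots, \pi_{2r-1}, \pi^p_{2r}, \pi^g_{2r}, \pi_{2r+1}, \ldots$ the stationary 
distribution of the Markov chain in Fig. 3, where superscripts 
"p" and "g" stand for perfect and good states. We now have 
the following theorem. 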

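Theorem 1: Under BoS, we have the following: 

(a) The maximum rate the coded system can support is: 

$$0 < \lambda < r\mu\left(1 - \frac{1}{8r^2 - 4r + 1}\right). \tag{6}$$

(b) If the system is stable, i.e., (6) is satisfied, the steady-state 
probabilities can be computed by the following 
iterative process: 

$$\pi_0 = \frac{1 - \eta}{(1 - \eta)\sum_{l=0}^{2r-2} a_l + \frac{\lambda a_{2r-2}}{2r\mu} + a_{2r-1}},$$

$$\pi_l = a_l \pi_0, \quad \forall\, l \in \{1, \ldots, 2r-1\},$$

$$\pi_{2r+2m-1} = \frac{\lambda}{2r\mu}\left(\pi^p_{2r+2m-2} + \pi^g_{2r+2m-2} + \pi_{2r+2m-3}\right), \quad \forall\, m \ge 1,$$

$$\pi^p_{2r} = \frac{1}{\gamma_p}\left[\beta_p(\pi_{2r-1} + \pi_{2r-2}) + \lambda\pi_{2r-2}\right],$$

$$\pi^p_{2r+2m} = \frac{1}{\gamma_p}\left[\beta_p\left(\pi^p_{2r+2m-2} + \pi^g_{2r+2m-2} + \pi_{2r+2m-1}\right) + \lambda\pi^p_{2r+2m-2} - (2r-1)\lambda\pi^g_{2r+2m-2}\right], \quad \forall\, m \ge 1,$$

and $\pi^g_{2r}$ and $\pi^g_{2r+2m}$ are similarly computed by replacing 
$\gamma_p, \beta_p$ with $\gamma_g, \beta_g$. 

Proof: See Appendix A. ■ 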

Note that a system with $2r$ servers should ideally support any 
arrival rate $0 < \lambda < r\mu$. However, we see from equation 
(6) of Theorem 1 that, under BoS, there is a loss in 
throughput of order $O(1/r^2)$. This loss quickly goes to zero 
as $r$ increases. Thus, BoS indeed ensures high throughput of 
the coded system even for moderate values of $r$. Part (b) of 
Theorem 1 then provides an efficient way of analyzing the 
system. We also note that our results are not asymptotic and 
can be applied to systems of small sizes. 
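
The iterative process of Theorem 1 can be sketched in a few lines. This is a minimal numerical sketch, not the authors' implementation: the function names, the truncation parameter `levels`, and the choice to return the accumulated probability mass next to the delay are assumptions made here. The delay is obtained by Little's law with packet arrival rate 2λ, as in the expression at the end of Appendix A.

```python
def bos_capacity(r, mu=1.0):
    """Largest supportable arrival rate under BoS, equation (6)."""
    return r * mu * (1.0 - 1.0 / (8 * r * r - 4 * r + 1))


def bos_packet_delay(lam, r, mu=1.0, levels=5000):
    """Return (average packet request delay, total probability mass).

    The infinite chain is truncated after `levels` even/odd state pairs
    above 2r; the returned mass should be very close to 1.
    """
    if not 0 < lam < bos_capacity(r, mu):
        raise ValueError("lam must lie inside the capacity region (6)")
    n = 2 * r
    # Notation (4).
    eta = (lam / (n * mu) + lam * (n - 1) * mu / (n * mu) ** 2
           + lam * mu / ((n - 1) * mu * n * mu))
    gam_p = n * (lam + (n - 1) * mu) + (lam + n * mu)
    gam_g = -(n - 1) * (lam + n * mu) / n - (n - 1) * (lam + (n - 1) * mu)
    bet_p = lam * (lam + (n - 1) * mu) / mu
    bet_g = -lam * (lam + n * mu) / (n * mu)
    # Recursion (5) for a_0, ..., a_{2r-1}.
    a = [1.0, lam / mu]
    for l in range(2, n):
        a.append(lam / (l * mu) * (a[l - 1] + a[l - 2]))
    pi0 = (1 - eta) / ((1 - eta) * sum(a[:n - 1])
                       + lam * a[n - 2] / (n * mu) + a[n - 1])
    pi = [x * pi0 for x in a]                 # states 0 .. 2r-1
    mass = sum(pi)
    mean = sum(l * p for l, p in enumerate(pi))
    # States (2r, p) and (2r, g).
    s = pi[n - 1] + pi[n - 2]
    p = (bet_p * s + lam * pi[n - 2]) / gam_p
    g = (bet_g * s + lam * pi[n - 2]) / gam_g
    odd = pi[n - 1]                           # pi_{2r-1}
    for m in range(levels):
        # Here p, g are the (2r+2m) states and odd is pi_{2r+2m-1}.
        mass += p + g
        mean += (n + 2 * m) * (p + g)
        odd = lam / (n * mu) * (p + g + odd)  # pi_{2r+2m+1}
        mass += odd
        mean += (n + 2 * m + 1) * odd
        s = p + g + odd
        p, g = ((bet_p * s + lam * p - (n - 1) * lam * g) / gam_p,
                (bet_g * s + lam * p - (n - 1) * lam * g) / gam_g)
    return mean / (2 * lam), mass
```

Checking that the returned mass is close to 1 is a cheap way to confirm that the arrival rate is far enough inside the capacity region for the chosen truncation level.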

Using the approach in Theorem 1, we can compute the 
average packet request delay in the system using the expression 
given at the end of the proof in Appendix A. We then compute 
the delay gain of coding by: 

$$\text{Delay Gain} = \frac{d^{\text{uncoded}}_{\text{packet}} - d^{\text{coded}}_{\text{packet}}}{d^{\text{uncoded}}_{\text{packet}}}. \tag{7}$$

Fig. 5 shows the delay gain for different values of $r$. We see 
that the gain is significant even for small $r$, e.g., the gain is 13% 
for $r = 4$, and can be up to 17% when $r = 10$. Finally, 
we emphasize that Fig. 5 is obtained analytically, computed 
using the results in Theorem 1. We note that in Fig. 5, for every 
value of $r$, the delay gain is only plotted for arrival rates that are within 
the capacity region of the system, i.e., (6). 

Fig. 5. Delay reduction of BoS over the replication scheme. When $r = 4$, the 
delay gain is around 13%, and when $r = 10$, the gain is 17%. The plots are 
generated using the analytical results in Theorem 1. The reason the gain decreases 
as the arrival rate increases is that BoS has a slightly smaller capacity 
region. Thus, if the rate is very close to the capacity region of BoS, the delay 
under BoS will be larger than that under the replication scheme. However, 
this happens only when the rate is very close to the capacity boundary. For 
most of the rates, codes achieve a significant delay reduction. 

C. Average request delay 

Notice that the above analysis allows us to derive the 
average packet request delay. However, in practice, we care 
more about the average request delay, which is defined as the 
average time it takes for both packet requests of a request to 
get served. Our following lemma shows that under BoS, the 
average request delay is roughly equal to the packet delay. 
This is a very desired feature not possessed by the uncoded 
system. 

Lemma 1: Let $d^{\text{coded}}_{\text{req}}$ and $d^{\text{coded}}_{\text{packet}}$ be the average request 
delay and average packet request delay under BoS, then: 

$$d^{\text{coded}}_{\text{req}} - d^{\text{coded}}_{\text{packet}} = \frac{2r-1}{2r-2}\cdot\frac{1}{2\mu} - \frac{1}{(2r-2)(2r-1)2r\mu} - \frac{1}{2(2r-1)\mu}. \tag{8}$$

Proof: See Appendix B. ■ 
We see from the lemma that as the number of servers $2r$ gets 
large, the difference between the packet delay and the request 
delay roughly equals $\frac{1}{2\mu}$. This is exactly the difference between 
the average of the service times of two packet requests and their 
maximum. Hence, it will appear under any scheduling policy, 
regardless of whether coding is used or not. Therefore, Lemma 1 shows 
that BoS ensures that the average packet latency is almost 
equal to the average request latency. This is a very important 
feature of the BoS algorithm. 
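
The gap in Lemma 1 is easy to evaluate numerically. The following sketch (the function name `bos_request_packet_gap` is an assumption introduced here) evaluates the right-hand side of (8) and shows how quickly it approaches the limit 1/(2μ), which equals E[max(s1, s2)] - E[(s1 + s2)/2] = 3/(2μ) - 1/μ for two independent exponential service times with rate μ.

```python
def bos_request_packet_gap(r, mu=1.0):
    """Right-hand side of (8): d_req - d_packet under BoS, for r >= 2."""
    if r < 2:
        raise ValueError("the expression requires r >= 2")
    n = 2 * r
    return ((n - 1) / (n - 2) / (2 * mu)
            - 1.0 / ((n - 2) * (n - 1) * n * mu)
            - 1.0 / (2 * (n - 1) * mu))


# For r = 2 and mu = 1 the gap is 3/4 - 1/24 - 1/6 = 13/24.
gap_r2 = bos_request_packet_gap(2)
# For large r the gap tends to 1 / (2 * mu).
gap_large = bos_request_packet_gap(10**6)
```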

V. Conclusion 

In this paper, we pose a fundamental question of the role of 
codes in improving the latency of content retrieval in storage 
systems. The interplay between coding and queueing delay 
is of complex nature. As a first step to make progress on 
this complex problem we propose and analyze a simplified 
setting of a single content divided into two parts and served by 
multiple servers. We see that even in this simplified setting the 
exact analysis of queueing delay for the systems using codes 
is intractable. As a result we provide a sub-optimal scheduling 
algorithm called Blocking-one Scheduling (BoS) that allows 
us to theoretically quantify the gains in latency achieved by 
coded system as compared to a system that uses replication. 
The methodology we developed in this paper is applicable to 
a more general setting that allows splitting the content into 
more than 2 parts. Further generalizations of our work, namely a) 
extending our scheduling algorithm to incorporate more than 
one blocking to improve the analytical gains in latency, and 
b) considering the scenario of serving multiple contents, are part of 
our future work. 

Appendix A - Proof of Theorem 1 

We now prove Theorem 1 by analyzing the Markov chain 
using a set of carefully chosen global balance equations, 
which we call "cuts." Our approach is to first compute $\pi_0$. 
Then starting from $\pi_0$, we iteratively compute all the other 
probabilities. 

Proof: (Theorem 1) First consider the sets that contain 
the $(2r + 2m, p)$ and $(2r + 2m, g)$ states, i.e., the Type 3 
cuts in Fig. 3. We note that if the system is stable, then the 
total transition rate going out from any set of states must be 
equal to the total rate going into them. Thus, we first have the 
following equation for the states $(2r, p)$ and $(2r, g)$: 

$$\pi^p_{2r}(\lambda + 2r\mu) + \pi^g_{2r}(\lambda + (2r-1)\mu) = \pi_{2r-2}\lambda + \pi_{2r+1}2r\mu. \tag{9}$$

Then for states $(2r + 2m, p)$ and $(2r + 2m, g)$ with $m \ge 1$, 
we have: 

$$\pi^p_{2r+2m}(\lambda + 2r\mu) + \pi^g_{2r+2m}(\lambda + (2r-1)\mu) = \left(\pi^p_{2r+2m-2} + \pi^g_{2r+2m-2}\right)\lambda + \pi_{2r+2m+1}2r\mu. \tag{10}$$

Summing (9) and (10) over $m = 1, 2, \ldots$, we get: 

$$2r\mu\sum_{m=0}^{\infty}\pi^p_{2r+2m} + (2r-1)\mu\sum_{m=0}^{\infty}\pi^g_{2r+2m} = \lambda\pi_{2r-2} + 2r\mu\sum_{m=0}^{\infty}\pi_{2r+2m+1}. \tag{11}$$

Now consider the diagonal cuts, i.e., the Type 2 cuts, starting 
from states $(2r, p)$ and $(2r, g)$. We have: 

$$(\pi_{2r-2} + \pi_{2r-1})\lambda = \pi^p_{2r}2r\mu + \pi^g_{2r}(2r-1)\mu, \tag{12}$$

$$\left(\pi^p_{2r+2m} + \pi^g_{2r+2m} + \pi_{2r+2m+1}\right)\lambda = \pi^p_{2r+2m+2}2r\mu + \pi^g_{2r+2m+2}(2r-1)\mu. \tag{13}$$

Summing (12) and (13) over $m = 0, 1, \ldots$, we get: 

$$2r\mu\sum_{m=0}^{\infty}\pi^p_{2r+2m} + (2r-1)\mu\sum_{m=0}^{\infty}\pi^g_{2r+2m} = \lambda\left[1 - \sum_{l=0}^{2r-3}\pi_l\right]. \tag{14}$$

Using (11) and (14), we thus obtain: 

$$2r\mu\sum_{m=0}^{\infty}\pi_{2r+2m+1} = \lambda\left[1 - \sum_{l=0}^{2r-2}\pi_l\right]. \tag{15}$$

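We now try to first find $\pi_0$. Considering only the "p" states, we 
get: 

$$\pi^p_{2r}(\lambda + 2r\mu) = \pi_{2r-2}\lambda + \pi_{2r+1}(2r-1)\mu, \tag{16}$$

$$\pi^p_{2r+2m}(\lambda + 2r\mu) = \pi^p_{2r+2m-2}\lambda + \pi_{2r+2m+1}(2r-1)\mu. \tag{17}$$

Summing (16) and (17) over $m \ge 1$, we obtain: 

$$2r\mu\sum_{m=0}^{\infty}\pi^p_{2r+2m} = \lambda\pi_{2r-2} + (2r-1)\mu\sum_{m=0}^{\infty}\pi_{2r+2m+1}. \tag{18}$$

Similarly, we can look at the "g" states and get: 

$$\pi^g_{2r}(\lambda + (2r-1)\mu) = \pi_{2r+1}\mu, \tag{19}$$

$$\pi^g_{2r+2m}(\lambda + (2r-1)\mu) = \pi^g_{2r+2m-2}\lambda + \pi_{2r+2m+1}\mu. \tag{20}$$

Summing these up, we get: 

$$(2r-1)\mu\sum_{m=0}^{\infty}\pi^g_{2r+2m} = \mu\sum_{m=0}^{\infty}\pi_{2r+2m+1}. \tag{21}$$

Using (15), (18) and (21), we obtain: 

$$\sum_{m=0}^{\infty}\left(\pi^p_{2r+2m} + \pi^g_{2r+2m} + \pi_{2r+2m+1}\right) = \left[\frac{\lambda}{2r\mu} + \frac{\lambda(2r-1)\mu}{(2r\mu)^2} + \frac{\lambda\mu}{(2r-1)\mu \cdot 2r\mu}\right]\left[1 - \sum_{l=0}^{2r-2}\pi_l\right] + \frac{\lambda}{2r\mu}\,\pi_{2r-2}.$$

Since all the probabilities sum to one, the left-hand side equals 
$1 - \sum_{l=0}^{2r-1}\pi_l$. Therefore, we get: 

$$\left[\frac{\lambda}{2r\mu} + \frac{\lambda(2r-1)\mu}{(2r\mu)^2} + \frac{\lambda\mu}{(2r-1)\mu \cdot 2r\mu}\right]\left[1 - \sum_{l=0}^{2r-2}\pi_l\right] + \frac{\lambda}{2r\mu}\,\pi_{2r-2} = \left[1 - \sum_{l=0}^{2r-2}\pi_l\right] - \pi_{2r-1}. \tag{22}$$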

We see that (22) provides one equation in terms of only 
$\pi_0, \ldots, \pi_{2r-1}$. Below we show that all the probabilities 
$\pi_1, \ldots, \pi_{2r-1}$ can be expressed in terms of $\pi_0$. In this case 
(22) will allow us to compute $\pi_0$ exactly. This in turn enables 
us to compute $\pi_1, \ldots, \pi_{2r-1}$. To do so, we first consider the 
Type 1 and Type 2 cuts shown in Fig. 3 to get: 

$$\pi_1 = \frac{\lambda}{\mu}\,\pi_0, \tag{23}$$

$$\pi_{2i} = \frac{\lambda}{2i\mu}\,(\pi_{2i-2} + \pi_{2i-1}), \quad \forall\, i \in \{1, \ldots, r-1\}, \tag{24}$$

$$\pi_{2i+1} = \frac{\lambda}{(2i+1)\mu}\,(\pi_{2i-1} + \pi_{2i}), \quad \forall\, i \in \{1, \ldots, r-1\}. \tag{25}$$

Using (23), (24) and (25), one can obtain: 

$$\pi_l = a_l\pi_0, \quad \forall\, l \in \{0, \ldots, 2r-1\}, \tag{26}$$

where $\{a_l,\ l = 0, \ldots, 2r-1\}$ are defined as in (5). Plugging (26) 
back into (22) and denoting $\eta = \frac{\lambda}{2r\mu} + \frac{\lambda(2r-1)\mu}{(2r\mu)^2} + \frac{\lambda\mu}{(2r-1)\mu \cdot 2r\mu}$, 
we have: 

$$1 - \sum_{l=0}^{2r-2} a_l\pi_0 = \eta\left[1 - \sum_{l=0}^{2r-2} a_l\pi_0\right] + \frac{\lambda a_{2r-2}}{2r\mu}\,\pi_0 + a_{2r-1}\pi_0.$$

Therefore: 

$$\pi_0 = \frac{1 - \eta}{(1 - \eta)\sum_{l=0}^{2r-2} a_l + \frac{\lambda a_{2r-2}}{2r\mu} + a_{2r-1}}. \tag{27}$$

It is not difficult to verify that $\pi_0$ is a valid probability if 
$1 - \eta > 0$, i.e., 

$$\frac{\lambda}{2r\mu} + \frac{\lambda(2r-1)\mu}{(2r\mu)^2} + \frac{\lambda\mu}{(2r-1)\mu \cdot 2r\mu} < 1. \tag{28}$$

This implies that the supportable rate is: 

$$\lambda < r\mu\cdot\frac{1}{1 + \frac{1}{8r^2 - 4r}} = r\mu\left(1 - \frac{1}{8r^2 - 4r + 1}\right). \tag{29}$$

Here $r\mu$ is the total rate the system can ever support. Hence, 
we see that BoS only loses a fraction $\frac{1}{8r^2 - 4r + 1}$ of the 
throughput. This proves Part (a). 


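To prove Part (b), we first see that one can now use (26) 
to compute $\pi_0, \ldots, \pi_{2r-1}$. To compute $\pi^p_{2r+2m}, \pi^g_{2r+2m}$, 
$m \ge 0$, we start from $m = 0$. We have from (12) that: 

$$\pi^p_{2r}2r\mu + \pi^g_{2r}(2r-1)\mu = \lambda(\pi_{2r-1} + \pi_{2r-2}). \tag{30}$$

Now if we look at the states $(2r, p)$ and $(2r, g)$ separately, we 
get: 

$$\pi^p_{2r}(\lambda + 2r\mu) = \pi_{2r-2}\lambda + \pi_{2r+1}(2r-1)\mu, \tag{31}$$

$$\pi^g_{2r}(\lambda + (2r-1)\mu) = \pi_{2r+1}\mu. \tag{32}$$

Canceling the term $\pi_{2r+1}$ in (31) and (32), we get that: 

$$\pi^p_{2r}(\lambda + 2r\mu) - \pi^g_{2r}(2r-1)(\lambda + (2r-1)\mu) = \lambda\pi_{2r-2}. \tag{33}$$

With (30) and (33), we can now compute $\pi^p_{2r}, \pi^g_{2r}$. To make 
the expressions more concise, we define: 

$$\gamma_p = \frac{2r\mu(\lambda + (2r-1)\mu)}{\mu} + (\lambda + 2r\mu), \qquad \gamma_g = \frac{-(2r-1)\mu(\lambda + 2r\mu)}{2r\mu} - (2r-1)(\lambda + (2r-1)\mu),$$

$$\beta_p = \frac{\lambda(\lambda + (2r-1)\mu)}{\mu}, \qquad \beta_g = \frac{-\lambda(\lambda + 2r\mu)}{2r\mu}.$$

Then we get: 

$$\pi^p_{2r} = \frac{1}{\gamma_p}\left[\beta_p(\pi_{2r-1} + \pi_{2r-2}) + \lambda\pi_{2r-2}\right], \tag{34}$$

$$\pi^g_{2r} = \frac{1}{\gamma_g}\left[\beta_g(\pi_{2r-1} + \pi_{2r-2}) + \lambda\pi_{2r-2}\right]. \tag{35}$$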

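Now for all the states $(2r + 2m, p)$ and $(2r + 2m, g)$ with 
$m \ge 1$, using (13), (17) and (20), we get: 

$$2r\mu\,\pi^p_{2r+2m} + (2r-1)\mu\,\pi^g_{2r+2m} = \lambda\left(\pi^p_{2r+2m-2} + \pi^g_{2r+2m-2} + \pi_{2r+2m-1}\right),$$

$$(\lambda + 2r\mu)\pi^p_{2r+2m} - (2r-1)(\lambda + (2r-1)\mu)\pi^g_{2r+2m} = \lambda\pi^p_{2r+2m-2} - (2r-1)\lambda\pi^g_{2r+2m-2}.$$

We can thus obtain the following equations for all states $2r + 2m$, 
$m \ge 1$:^3 

$$\pi^p_{2r+2m} = \frac{1}{\gamma_p}\left[\beta_p\left(\pi^p_{2r+2m-2} + \pi^g_{2r+2m-2} + \pi_{2r+2m-1}\right) + \lambda\pi^p_{2r+2m-2} - (2r-1)\lambda\pi^g_{2r+2m-2}\right],$$

$$\pi^g_{2r+2m} = \frac{1}{\gamma_g}\left[\beta_g\left(\pi^p_{2r+2m-2} + \pi^g_{2r+2m-2} + \pi_{2r+2m-1}\right) + \lambda\pi^p_{2r+2m-2} - (2r-1)\lambda\pi^g_{2r+2m-2}\right].$$

^3 It can be verified that both $\pi^p_{2r+2m}$ and $\pi^g_{2r+2m}$ are positive. Thus 
they are valid probabilities. 

Then, the probabilities $\pi_{2r+2m-1}$ with $m \ge 1$ can be computed 
using Type 1 cuts, i.e., 

$$2r\mu\,\pi_{2r+2m-1} = \lambda\left(\pi^p_{2r+2m-2} + \pi^g_{2r+2m-2} + \pi_{2r+2m-3}\right). \tag{36}$$

With all the above results, the average packet request delay in 
the system can be computed as: 

$$d^{\text{coded}}_{\text{packet}} = \frac{1}{2\lambda}\left[\sum_{l=1}^{2r-1} l\,\pi_l + \sum_{m \ge 0}(2r + 2m + 1)\,\pi_{2r+2m+1} + \sum_{m \ge 0}(2r + 2m)\left(\pi^p_{2r+2m} + \pi^g_{2r+2m}\right)\right].$$

This completes the proof of Part (b). 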



Appendix B - Proof of Lemma 1 

Here we prove Lemma 1. 

Proof: (Lemma 1) Consider any request $u$ that enters and 
departs from the system. Let $w_{u1}$ and $w_{u2}$ be the waiting times 
of its first and second packet requests in the queue. Then let 
$s_{u1}$ and $s_{u2}$ be the service times of the two packet requests. 

We now derive a relationship between $w_{u1}$ and $w_{u2}$. According 
to BoS, packet request 1 always goes before packet 
request 2, thus $w_{u1} \le w_{u2}$. Now suppose packet request 1 
goes to a storage unit $j$ at a time $t$. Then, the extra waiting 
time of packet request 2 in the queue is exactly the time it 
takes for any of the other $2r - 1$ storage units to be free. By 
the exponential service time nature, this time is exponentially 
distributed with rate $(2r - 1)\mu$, i.e., 

$$w_{u2} = w_{u1} + T_u, \quad \text{where } T_u \sim \exp((2r-1)\mu). \tag{37}$$

Now let $t_{u1}$ and $t_{u2}$ be the total times packet requests 1 and 2 
stay in the system, and let $t_u$ be the time the request spends 
in the system. We have: 

$$t_{u1} = w_{u1} + s_{u1}, \tag{38}$$

$$t_{u2} = w_{u1} + T_u + s_{u2}, \tag{39}$$

$$t_u = w_{u1} + \max(s_{u1}, s_{u2} + T_u). \tag{40}$$
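Denote $Z = \max(s_{u1}, s_{u2} + T_u)$. It can be verified that, for $z \ge 0$: 

$$f_Z(z) = \mu e^{-\mu z} + \frac{2r-1}{2r-2}\left(\mu e^{-\mu z} - \mu e^{-(2r-1)\mu z}\right) - \frac{2r-1}{2r-2}\,2\mu e^{-2\mu z} + \frac{2r}{2r-2}\,\mu e^{-2r\mu z},$$

and: 

$$E[Z] = \frac{1}{\mu} + \frac{2r-1}{2r-2}\cdot\frac{1}{2\mu} - \frac{1}{(2r-2)(2r-1)2r\mu}. \tag{41}$$

It thus follows that the difference between the average request 
delay and the average packet request delay under BoS is given 
by: 

$$d^{\text{coded}}_{\text{req}} - d^{\text{coded}}_{\text{packet}} = E[t_u] - \frac{1}{2}\left(E[t_{u1}] + E[t_{u2}]\right) = \frac{2r-1}{2r-2}\cdot\frac{1}{2\mu} - \frac{1}{(2r-2)(2r-1)2r\mu} - \frac{1}{2(2r-1)\mu}.$$

This completes the proof. ■ 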



References 

[1] G. Ananthanarayanan, S. Agarwal, S. Kandula, A. Greenberg, and 
I. Stoica. Scarlett: Coping with skewed content popularity in MapReduce 
clusters. Proceedings of ACM EuroSys, 2011. 

[2] D. P. Bertsekas and R. G. Gallager. Data Networks (2nd Edition). 1992. 

[3] M. Bramson, Y. Lu, and B. Prabhakar. Randomized load balancing with 
general service time distributions. Proceedings of ACM Sigmetrics, 2010. 

[4] T. Dikaliotis, A. G. Dimakis, T. Ho, and M. Effros. On the delay of 
network coding over line networks. IEEE International Symposium on 
Information Theory (ISIT), 2009. 

[5] A. Eryilmaz, A. Ozdaglar, M. Medard, and E. Ahmed. On the 
delay and throughput gains of coding in unreliable networks. IEEE 
Transactions on Information Theory, Dec 2008. 

[6] A. Gandhi and M. Harchol-Balter. How data center size impacts the 
effectiveness of dynamic power management. Proc. of the Forty-Ninth 
Annual Allerton Conference, 2011. 

[7] A. Gandhi, M. Harchol-Balter, and I. Adan. Server farms with setup 
costs. Performance Evaluation, Volume 67, Issue 11, Nov 2010. 

[8] X. Li, C.-C. Wang, and X. Lin. Throughput and delay analysis on 
uncoded and coded wireless broadcast with hard deadline constraints. 
Proceedings of IEEE INFOCOM Mini-Conference, March 2010. 

[9] Y. Lu, Q. Xie, G. Kliot, A. Geller, J. Larus, and A. Greenberg. Join-
idle-queue: A novel load balancing algorithm for dynamically scalable 
web services. 29th International Symposium on Computer Performance, 
Modeling, Measurements, and Evaluation (Performance), 2011. 

[10] A. ParandehGheibi, M. Medard, A. Ozdaglar, and S. Shakkottai. Avoiding 
interruptions: A QoE reliability function for streaming media applications. 
To appear in IEEE Journal on Selected Areas in Communications, 
special issue on Trading Rate for Delay at the Transport and Application 
Layers, 2012. 

[11] W. Yeow, A. Hoang, and C. Tham. Minimizing delay for multicast- 
streaming in wireless networks with network coding. Proceedings of 
IEEE INFOCOM, 2009. 



